InfoVis 2003 Contest – Zoomology Entry

Jin Young Hong, Mark Richman, Jonathan D'Andries, Maryann Westfall          

hongjy@cc.gatech.edu, marktruly@lycos.com, gte498e@prism.gatech.edu, maryannwestfall@mindspring.com

Georgia Institute of Technology

Ratings used below: (Strength,Possible,Difficult,Not Available)

Pairwise comparisons of trees: Topological changes

Did anything change, in general, or in a subtree?

Rating: Strength

Process: The overview of both trees is always shown on the display, with changed areas highlighted in white. Switching to the node-link overview also highlights in red the nodes that belong only to the left dataset and in blue the nodes that belong only to the right dataset.

Image:

Answer: About 90% of the trees are the same.
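The red and blue highlighting, and a rough "percent the same" figure, can be thought of as set operations over node names. The sketch below is an illustration only: it assumes nodes are matched by name, which this entry does not state, and the function names are hypothetical rather than taken from Zoomology.

```python
def classify_nodes(left_names, right_names):
    """Split node names into left-only, right-only, and shared groups."""
    left, right = set(left_names), set(right_names)
    return {
        "left_only": left - right,    # would be drawn in red
        "right_only": right - left,   # would be drawn in blue
        "shared": left & right,
    }


def shared_fraction(left_names, right_names):
    """One possible similarity estimate: shared names over all distinct names."""
    groups = classify_nodes(left_names, right_names)
    total = sum(len(names) for names in groups.values())
    return len(groups["shared"]) / total if total else 1.0


if __name__ == "__main__":
    # Tiny illustrative name lists, not the contest data.
    left = ["Mammalia", "Monotremata", "Diprotodontia"]
    right = ["Mammalia", "Diprotodontia", "Teleostei"]
    print(classify_nodes(left, right))
    print(shared_fraction(left, right))
```

Matching by name alone ignores nodes that kept their name but changed position; those are covered by the questions on moved nodes.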

What nodes were added, deleted?

Rating: Possible

Process: Look up a node using a text query, by selecting it in the overview, or by zooming through the tree in the detail window. Click on the colored box to the left of the node’s name to locate it in the other tree. Compare the paths drawn on each side of the legend to see whether additional nodes exist in the path of one tree or the other.

Image:

   

Answer: Zooming in on Mammalia in the right-hand window, we located Monotremata. Clicking on its label located the node in the left-hand tree. The legend located between the detail windows shows the path to the node in each tree marked in yellow. The paths differ as the left-hand dataset includes a Subclass not present in the right-hand tree.
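Comparing the two yellow paths amounts to finding where two root-to-node paths stop agreeing. A minimal sketch, assuming each path is a list of node names from the top down; the function name is hypothetical, and `<Subclass>` is a placeholder because the extra Subclass is not named here.

```python
def split_paths(left_path, right_path):
    """Return (shared_prefix, left_rest, right_rest) for two top-down paths."""
    i = 0
    while i < min(len(left_path), len(right_path)) and left_path[i] == right_path[i]:
        i += 1
    return left_path[:i], left_path[i:], right_path[i:]


def extra_nodes(left_path, right_path):
    """Nodes that appear in only one of the two paths."""
    _, left_rest, right_rest = split_paths(left_path, right_path)
    only_left = [n for n in left_rest if n not in right_rest]
    only_right = [n for n in right_rest if n not in left_rest]
    return only_left, only_right


if __name__ == "__main__":
    # Abbreviated paths for illustration; "<Subclass>" is a placeholder name.
    left = ["Mammalia", "<Subclass>", "Monotremata"]
    right = ["Mammalia", "Monotremata"]
    print(split_paths(left, right))
    print(extra_nodes(left, right))
```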

Did any node or subtrees "move" in the tree? Can you characterize those movements?

Rating: Possible

Process:

By discovery: Link the detail windows. Zoom into a node. Unlink the windows. Click on the colored box to the left of the node’s name to locate it in the other tree. If the other window does not change, the nodes are in the same location in both trees.

By zoom: Unlink the windows and locate a node in one tree. Click on the colored box to the left of the node’s name to locate it in the other tree. Link the windows and click the node again. If the other window does not change, the nodes are in the same location in both trees.

Other: Locate a node in both trees by clicking its label as above. If the windows are in different areas, zoom out in each window until both windows show a common parent. The nodes traversed upwards to this point are the differences in the paths.

A subtree can also be identified in this way, since nodes may share common ancestors and descendants despite residing in different locations.

Image:

Answer: Zooming in to Diprotodontia in one tree, we clicked on its label to locate it in the other tree. The paths shown adjacent to the legend reveal that the ancestries are considerably different.
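Zoomology leaves the detection of moves to the user, but the underlying check is simple to state. The sketch below assumes each tree is available as a mapping from node name to parent name (a hypothetical representation, not Zoomology's) and reports nodes present in both trees whose parent differs. The parent names in the example are placeholders.

```python
def moved_nodes(left_parent, right_parent):
    """Names present in both trees whose immediate parent differs."""
    common = left_parent.keys() & right_parent.keys()
    return sorted(n for n in common if left_parent[n] != right_parent[n])


if __name__ == "__main__":
    # Placeholder parents; the real ancestries are only shown in the image.
    left_parent = {"Mammalia": None, "<parent in A>": "Mammalia",
                   "Diprotodontia": "<parent in A>"}
    right_parent = {"Mammalia": None, "<parent in B>": "Mammalia",
                    "Diprotodontia": "<parent in B>"}
    print(moved_nodes(left_parent, right_parent))
```

Only the top node of a relocated subtree is reported, since its descendants keep the same parents; this mirrors the observation that a subtree can be identified by its shared descendants.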

Pairwise comparisons of trees: Attribute value changes

Global impression: did things change a lot or not?

Rating: Strength

Process: Change is represented by white areas within our overview.

Image:

Answer: Only a small area is marked in white. The trees are about 90% the same.

 

What nodes or subtrees changed the most?

Rating: Not Available

Process: Our model can identify change, not how much things have changed.

Image:

Answer: N/A

Did the value of attribute XYZ for this node increase or decrease? In absolute terms, or relatively to other siblings or other nodes.

Rating: Not Available 

Process: Only categorical changes are represented, not quantitative change. Changes in rank are easily identified by changes in color. The user can compare the relative rankings of two nodes by checking the color of each against the legend, which represents the hierarchy of ranks.

Image:

Answer:

General visualization of trees: Topology

Overall characteristics: How large is the tree? How many levels deep? What is the deepest branch? Does the depth vary between subtrees or not?

Rating: Strength

Process: The overview maps all levels of the tree from top to bottom. Counting the levels of the deepest branch finds the tree’s depth and the overview shows how the depth varies between different portions of the tree. The number to the right of the node name indicates the number of its descendants. Thus, we can tell the number of nodes for any subtree, as well as for each tree’s entire Kingdom.

Image:

Answer: From the image associated with question #1, we see that tree-A contains 190,263 nodes beneath Kingdom and tree-B contains 189,538. This picture shows 3,023 Mammals in tree-A and 6,052 in the other.

Path: What is the path of this node?

Rating: Possible

Process: The path of rankings to a node is provided in the legend, although the legend does not give the names of its ancestors. The path to the node is also displayed in the overview, where the user can mouse over the nodes in the path to learn the names of all ancestors.

Image:

Answer:

Local relatives: What are the children, siblings, or cousins of this node?

Rating: Strength

Process: Zooming in on a node will locate its children. Clicking the center mouse button in the window will pan to uncover its siblings. To locate its cousins, place the cursor on a sibling and right-click to zoom out one level and view its parent. Pan to each sibling of the parent and zoom in one level to locate each set of cousins.

Image:

   

Answer: The image above demonstrates pan within a window.

Filtering by level: Show only the first level, or show only 3 levels down, or remove all the leaves

Rating: Not Available

Process: Zoomology can easily show the first level for the entire dataset or any part. Simply click on the top node in the overview to show the first level in both detail windows. Zooming in on any node for two levels more will show the third level for a particular branch of the tree. However, we cannot show the entire tree’s third level in detail. In the overview, each level can be identified by its position from the top. Although each individual node is rendered at the upper levels of the overview, the lower levels aggregate nodes into a single pixel.

Image:

Answer:

Topology questions that involve counting nodes can be seen as attribute-dependent questions: e.g. Which branch contains the largest number of nodes? or Which branch has the largest fan-out?

Rating: Possible 

Process: The width of a node in the overview represents the number of its descendants. The user can compare branches starting at any of the top few levels at a glance to determine which is largest. Zoomology cannot tell which branch has the largest fan-out.

Image:

Answer:

General visualization of trees: Attribute based

Find nodes with high values of a numerical attribute X? (relative query)

Rating: Not Available

Process: The tool does not support numeric attributes.

Image:

Answer:

Find nodes with given value of a numerical attribute X? (absolute query)

Rating: Not Available

Process: The tool does not support numeric attributes.

Image:

 

Answer:

Find nodes with value Y of categorical attribute X - What value of a categorical attribute occurs more often? e.g. Are there more farm animals or pets?

Rating: Possible

Process: The only categorical attribute Zoomology supports is ranking. In the top levels of the overview instances of a particular rank can be identified by color. A color key organized hierarchically by rank is provided in the legend. While zooming into the detail view, nodes of a rank can easily be identified by color. A rough idea of the relative occurrence of rank is provided in the overview but the lower levels of the overview aggregate nodes into a single pixel and do not accurately represent their number.

 

Image:

Answer:

Find nodes with certain values of two or more attributes (What video file is used the most?)

Rating: Not Available

Process:

Image:

Answer:

Number of nodes in a tree or subtree? (How many animals? How many mammals?)

Rating: Strength

Process: Click on the overview or zoom in either of the detail windows to the node. The number of nodes contained beneath it is part of the label. This number is a count of all descendants of the node, not including the node itself.

Image:

Answer:

Tree-A animals: 190,263

Tree-B animals: 189,538

Tree-A mammals: 3,022

Tree-B mammals: 6,052
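The counts in the labels exclude the node itself. A minimal sketch of such a count, assuming the tree is held as a mapping from each node to a list of its children (a hypothetical representation); an explicit stack is used so that very large trees do not depend on recursion limits.

```python
def descendant_counts(children, root):
    """Map each node to the number of nodes beneath it, excluding itself."""
    counts = {}
    stack = [(root, False)]
    while stack:
        node, expanded = stack.pop()
        kids = children.get(node, [])
        if expanded:
            # All children are finished: each contributes itself plus its subtree.
            counts[node] = sum(counts[k] + 1 for k in kids)
        else:
            stack.append((node, True))
            stack.extend((k, False) for k in kids)
    return counts


if __name__ == "__main__":
    # Toy tree for illustration, not the contest data.
    children = {
        "Animalia": ["Chordata", "Arthropoda"],
        "Chordata": ["Mammalia"],
    }
    print(descendant_counts(children, "Animalia"))
```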

Comparison of branches of the tree (Subtrees with most nodes; are there more mammals or fish?)

Rating: Strength

Process: The count of descendants is included in the node’s label in the detail window.

Image:

Answer: In the image above we have panned one window to show the number of fish and the other to show the number of mammals.

Both trees show 13,180 types of fish, significantly more than the number of mammals noted in the previous question. Eat more fish!

Largest fanout (What is the largest group of animals with same lineage?)

Rating: Strength

Process: In the overview all nodes are allocated space based on the number of nodes underneath them in the tree. This is recursively summed, so the size of a parent node is the sum of the sizes of its children. Thus, large groups are reflected all the way up the tree. For example, a large order will take up much of the space beneath a large class, which might occupy most of the space underneath a large phylum.

Image:

Answer: Arthropoda is the biggest phylum, partly because Hexapoda is the biggest subphylum (followed closely by Vertebrata underneath Chordata). Diptera’s label is shown in the overview. Following the arthropods (large pink area in the overview), we get a large Insecta class, containing a large Pterygota subclass, containing a large Neoptera superorder, containing the three fairly big orders Diptera, Hymenoptera, and Trichoptera. This analysis may continue as you page down the tree.
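The recursive space allocation can be sketched as follows. This is an illustration under stated assumptions, not Zoomology's layout code: each leaf is given a weight of 1, a parent's weight is the sum of its children's weights, and each child receives a horizontal share of its parent's span in proportion to that weight. The tree representation and names are hypothetical.

```python
def layout(children, root, x0=0.0, x1=1.0):
    """Assign each node a horizontal span (lo, hi) proportional to its weight."""
    weights = {}

    def weight(node):
        kids = children.get(node, [])
        # Assumption: a leaf weighs 1; a parent weighs the sum of its children.
        weights[node] = sum(weight(k) for k in kids) if kids else 1
        return weights[node]

    spans = {}

    def place(node, lo, hi):
        spans[node] = (lo, hi)
        cursor = lo
        for kid in children.get(node, []):
            width = (hi - lo) * weights[kid] / weights[node]
            place(kid, cursor, cursor + width)
            cursor += width

    weight(root)
    place(root, x0, x1)
    return spans


if __name__ == "__main__":
    # Toy tree: branch "a" has three leaves, branch "b" has one.
    children = {"root": ["a", "b"], "a": ["a1", "a2", "a3"], "b": ["b1"]}
    for name, span in layout(children, "root").items():
        print(name, span)
```

In this toy tree, branch "a" takes three quarters of the width, which is how a large group stays visible at every level above it.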

General visualization of trees: Known items

Which nodes have a particular string in their label? (Find "giraffe" in a tree of animals)

Rating: Strength

Process: Perform a text search on the name specifying Latin Name, Common Name, or both.

Image:

Answer: The giraffe and giraffe seahorse are shown in different detail windows.
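A text search over Latin name, Common name, or both can be sketched as a case-insensitive substring match. The record layout (`latin` and `common` keys) and the function name are assumptions made for illustration, and the three sample records are not drawn from the contest files.

```python
def search(nodes, query, fields=("latin", "common")):
    """Return nodes whose chosen name fields contain the query, ignoring case."""
    q = query.lower()
    return [n for n in nodes
            if any(q in (n.get(f) or "").lower() for f in fields)]


if __name__ == "__main__":
    nodes = [
        {"latin": "Giraffa camelopardalis", "common": "giraffe"},
        {"latin": "Hippocampus camelopardalis", "common": "giraffe seahorse"},
        {"latin": "Equus caballus", "common": "horse"},
    ]
    print([n["common"] for n in search(nodes, "giraffe")])
    print([n["common"] for n in search(nodes, "giraffe", fields=("latin",))])
```

Restricting the search to the Latin name finds nothing for "giraffe" in these sample records, which is why the choice of field matters.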

Locate a node knowing its path

Rating: Strength

Process: Zoom through the detail windows or click on the overview.

Image:

Answer:

Go back to a node you have visited before

Rating: Possible

Process: If you know the name of the node, perform a text search. We do not explicitly store navigation history, but you can always navigate back or search for the name.

Image:

Answer:

General visualization of trees: Labeling

Review all the labels in a subtree

Rating: Possible

Process: In the detail windows you can easily view all the labels of the immediate children of a node, but you may have to zoom into the individual children to collect the labels at further depth. In the overview, Zoomology provides immediate access to all labels by rollover with the mouse. The level of granularity determines how accurately the label maps to its data point—nodes at the higher levels of the tree are easily distinguished whereas many species and subspecies may be clustered on a single pixel at the lower levels.

Image:

Answer: It is easy to distinguish the labels of the 12 subphylum nodes but overloading of pixels makes the 26,000 genus labels impossible to show accurately.
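The pixel overloading can be made concrete with a rough calculation. The sketch assumes an overview 1,000 pixels wide and nodes spread evenly across it; both are simplifying assumptions, since the actual overview width is not given and widths vary with subtree size.

```python
from collections import Counter


def nodes_per_pixel(n_nodes, width_px):
    """Count how many evenly spread nodes land on each pixel column."""
    return Counter(i * width_px // n_nodes for i in range(n_nodes))


if __name__ == "__main__":
    width = 1000  # assumed overview width in pixels
    print(max(nodes_per_pixel(12, width).values()))      # subphylum level
    print(max(nodes_per_pixel(26000, width).values()))   # genus level
```

Under these assumptions each subphylum has a pixel column to itself, while genus labels share a column with many others, so a rollover at that level cannot single one out.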

General visualization of trees: Browsing

Explore the tree by performing a series of up and downs in the tree

Rating: Strength

Process: Unlinking the detail windows will allow viewing of an individual tree. Link the windows or click on the overview for smooth navigation through both trees at once. The paths maintained on the legend and in the overview provide context.

Image:

Answer:

General visualization of trees: Managing the analysis

Marking nodes of interest

Rating: Difficult

Process: While searching or browsing, active selections are automatically marked by the system in the paths maintained on the legend and in the overview. Clicking on the higher levels of the path shown in the overview zooms the detail window to that point.

Image:

Answer:

Removing special anomalies

Rating: Not Available

Process:

Image:

Answer:

Saving visualization settings for future reference

Rating: Not Applicable

Process: There are only two user-specifiable settings: the choice of a space-filling or node-link overview and the choice of linking or unlinking detail windows.

Image:

Answer:

Keeping the history of your analysis, reviewing it and replaying it with different parameters

Rating: Not Available

Process:

Image:

Answer:

Phylogenies: Application specific tasks

The higher-level problem is to find the best way to map the similarities between the two trees' topologies, which would indicate co-evolution, and, maybe, the point(s) where the two proteins were not co-evolving. Is there co-evolution?

Rating:

< One of Strength,Possible,Difficult,Not Available>

Process:

Image:

Answer:

Interacting with the tree matching process to solve inconsistencies

Rating:

Not Available

Process:

Image:

Answer:

Displaying the trees, with or without taking into account the branch length (the length of the links)

Rating:

< One of Strength,Possible,Difficult,Not Available>

Process:

Image:

Answer:

Showing the relationships and differences from a computed or interactively constructed mapping

Rating:

Strength

Process:

Image:

Answer:

Providing ways to permute links and nodes to verify hypotheses interactively

Rating:

< One of Strength,Possible,Difficult,Not Available>

Process:

Image:

Answer:

Classifications: Application specific tasks

To what extent are the differences in the classifications due to differences in how animals are thought to be related? Are there other kinds of differences and can you explain them?

Rating: Strength

Process: The path of each node is shown in the overview as the node is displayed in a detail window. Zooming up the tree in the detail window or mousing over the nodes up to the top level in the overview reveals the name of each ancestor.

Image:

Answer: The path to Muscomorpha is shown by circles in the overview; the path to Teleostei is represented by triangles, showing a large difference in genealogy.

 

Can you say in how many different subtrees a particular common name (such as "dolphin" or "horse") is used? How closely are these animals related? Are common names a good guide to understanding relationships?

Rating: Possible

Process: Perform a text search on the name. A window containing all occurrences of the name is returned. Black areas in the overview show the various places at which these nodes occur within the trees.

Image:

Answer: A search on “dolphin” in both Latin and Common name returned 97 items. Black areas in the overview show a wide dispersal of these over the tree, implying that many “dolphins” are unrelated. Clicking on a node name in the returned text box will zoom to that node.
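The dispersal seen in the overview can also be expressed as a count of distinct subtrees containing a hit. A minimal sketch, assuming each search hit carries its top-down path as a list of names; the function name is hypothetical and the sample paths, including the placeholder `<fish class>`, are illustrative rather than taken from the 97 results.

```python
from collections import Counter


def subtrees_with_hits(hit_paths, depth):
    """Count hits per ancestor at the given depth (0 is the top of the tree)."""
    return Counter(path[depth] for path in hit_paths if len(path) > depth)


if __name__ == "__main__":
    # Illustrative paths only; "<fish class>" is a placeholder name.
    hits = [
        ["Animalia", "Chordata", "Mammalia"],
        ["Animalia", "Chordata", "Mammalia"],
        ["Animalia", "Chordata", "<fish class>"],
    ]
    by_class = subtrees_with_hits(hits, 2)
    print(len(by_class), dict(by_class))
```

Raising or lowering `depth` answers the question at a different rank: hits that share an ancestor only near the top of the tree are distantly related.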

How many species or subspecies are named after biologists named "Townsend"?

Rating: Strength

Process: Text search on Townsend.

Image:

Answer: 55 names were found, scattered throughout the tree. Note that if a node exists in different locations in the two trees, two names are returned.

What kind of feedback does your tool provide to alert the user quickly when a wrong name is entered?

Rating: Possible

Process: After searching with the search tool, click on the node’s name to zoom to it in the detail window. A node’s ranking is shown in its label and context is provided by the names of its siblings in the detail window, as well as by the paths represented in the overview and legend. Zoomology also provides an Alphaslider which shows the names of all nodes. However, it may take a while to pinpoint a single name out of 190,000.

Image:

Answer:

For the top five subtrees with the most nodes-- are they likely to have a parent of a particular rank? Or does this happen in many ranks? Can you comment on how useful "rank" is?

Rating: Possible

Process: The largest subtrees are easily discerned as those with the largest width in the overview. The descendants of Arthropoda comprise the four largest subtrees in the files.

Image:

Answer:

File system and usage logs: Application specific tasks

Where are the big directories?

Rating:

< One of Strength,Possible,Difficult,Not Available>

Process:

Image:

Answer:

Can you see different patterns in the files? (Can you make out the difference between personal pages, class pages and research project pages?)

Rating:

Strength

Process:

Image:

Answer:

Were there a lot of pages created recently? If so, in which part of the file system?

Rating:

< One of Strength,Possible,Difficult,Not Available>

Process:

Image:

Answer:

Are the newer directories bigger than the older projects?

Rating:

Difficult

Process:

Image:

Answer:

When was the page giving directions to the department last updated?

Rating:

< One of Strength,Possible,Difficult,Not Available>

Process:

Image:

Answer:

Which are the popular webpages?

Rating:

Not Available

Process:

Image:

Answer:

Are there some labs more popular than others?

Rating:

< One of Strength,Possible,Difficult,Not Available>

Process:

Image:

Answer:

Which areas are getting more popular? Less popular?

Rating:

Not Available

Process:

Image:

Answer:

Are new pages more popular than old pages?

Rating:

< One of Strength,Possible,Difficult,Not Available>

Process:

Image:

Answer:

Which old pages are popular?

Rating:

Not Available

Process:

Image:

Answer:

What proportion of the pages are never used?

Rating:

< One of Strength,Possible,Difficult,Not Available>

Process:

Image:

Answer:

What proportion of the pages are seldom used?

Rating:

Not Available

Process:

Image:

Answer:

Other Strengths of the System

Exploration

Rating: Strength

Process: Get Down and Zoom!

Image:

Answer: Zoomology promotes ad-hoc exploration of the differences between large datasets. Color coding easily identifies the ranks of nodes and white areas clearly mark change. Text search and clickable areas in the overview and detail windows assist the user in locating animals that reside in different places in each tree. The smooth zoom animation and use of color make Zoomology a fun, exciting tool to use.